Data Mining
and
Knowledge Discovery
John F. Sowa
VivoMind LLC
Panel Discussion
4 July 2003
Panelists
Today's speakers:
- Guy Mineau, Université Laval
  Graph Structures: Simple Models, Complex Processing
- Yves Kodratoff, Université Paris-Sud XI
  Text Mining: From Text Retrieval to Knowledge Extraction
- Amadeo Napoli, LORIA
  Knowledge Discovery in Databases
- Jean-Guy Meunier, UQÀM
  Computer Text Categorization
Speaker on Tuesday, July 8th:
- Ryszard Michalski, George Mason University
Inferential Theory of Learning
Peirce's Logic of Pragmatism
Relating Logic to the World
George Box: "All models are wrong, but some are useful."
Branches of Semeiotic
Peirce's classification:
- Grammar: Patterns of signs at every level of complexity in every sensory modality.
- Logic: Formal conditions for the truth of representations.
- Methodeutic: Methods of observation, experiment, and testing for relating signs to their referents in science, engineering, and everyday life.
Major Challenge
- Computer systems are very good at deduction.
- They can process large volumes of data for induction and abduction.
- But they cannot compete with a child in learning language.
- Why not?
Replacing Sherlock Holmes
Paradox of Information Retrieval
- People try to understand a document before classifying it.
- They try to understand a question before answering it.
- Since the 1950s, computational linguists have been developing sophisticated methods for information retrieval.
- But the most successful methods use little or no linguistics, as the sketch below illustrates.
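A minimal sketch of that paradox, assuming an invented three-document collection: bag-of-words retrieval with TF-IDF weighting, the style of method that dominates practice. There is no parsing and no grammar; each document is reduced to a multiset of tokens.

```python
# A minimal sketch, assuming an invented three-document collection:
# bag-of-words retrieval with TF-IDF weighting. No parsing, no grammar;
# each document is reduced to a multiset of tokens.
import math
from collections import Counter

documents = [
    "the cat sat on the mat",
    "machine translation of natural language",
    "information retrieval without linguistics",
]

def tokenize(text):
    return text.lower().split()

# Count how many documents contain each term (document frequency).
doc_tokens = [Counter(tokenize(d)) for d in documents]
df = Counter()
for tokens in doc_tokens:
    df.update(tokens.keys())

def score(query, tokens):
    """Sum the TF-IDF weights of the query terms found in one document."""
    n = len(documents)
    return sum(tokens[t] * math.log(n / df[t])
               for t in tokenize(query) if t in tokens)

query = "information retrieval"
best = max(range(len(documents)), key=lambda i: score(query, doc_tokens[i]))
print(documents[best])  # the top-ranked document, found with no linguistics
```

Everything linguistic about the documents (syntax, morphology, word order) is discarded by tokenize, yet ranking by weighted term overlap works well enough to dominate the field.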
Paradox of Machine Translation
- Human translators must know the subject matter.
- Research on knowledge-based MT since the 1970s.
- But the most widely used MT system is SYSTRAN:
Originally called GAT (Georgetown Automatic Translator).
Research terminated in 1963.
Uses a very big dictionary and very little linguistics (see the sketch below).
Now called Babelfish for translating WWW pages.
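The "big dictionary, little linguistics" design can be hedged into a toy illustration: word-by-word lookup and substitution, with no syntactic analysis at all. The French-English word list below is invented for the example; a production system differs mainly in the enormous size of its dictionary and its stock of special-case rules.

```python
# Toy sketch of dictionary-driven translation: look each word up and
# substitute, with no syntactic analysis. The French-English word list
# is invented for this example.
lexicon = {
    "le": "the", "la": "the", "chat": "cat", "est": "is",
    "sur": "on", "table": "table",
}

def translate(sentence):
    # Unknown words pass through untranslated.
    return " ".join(lexicon.get(word, word) for word in sentence.lower().split())

print(translate("Le chat est sur la table"))  # -> "the cat is on the table"
```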
Paradox of Machine Learning
Human knowledge is expressed in complex structures.
But most machine learning systems use very simple structures, as the sketch below illustrates:
- Boolean combinations
- Adjusting numerical weights
- Vectors of features
How can a system learn the structures that occur in language?
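A minimal perceptron sketch (toy data, invented features) makes the three items above concrete in one place: the entire learned hypothesis is a vector of numerical weights plus a bias, and training merely nudges those numbers.

```python
# Minimal perceptron sketch: the learner's whole hypothesis is a weight
# vector over fixed features plus a bias. Data and features are invented.
def train(examples, epochs=20, lr=0.1):
    w = [0.0] * len(examples[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in examples:  # y is 0 or 1
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = y - pred
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

# Learning the Boolean combination AND as a weighted threshold.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train(data)
print(w, b)  # the acquired "knowledge" is just three numbers
```

Three numbers suffice to encode Boolean AND, but no amount of weight adjustment lets such a representation grow the nested, quoted, tensed structures of the child's utterance above.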
Utterance by a 3-year-old Child
When I was a little girl, I could go "geek, geek" like that; but now I can go "This is a chair."
Enormous logical complexity in one short passage:
- Subordinate and coordinate clauses
- Tenses: Earlier time contrasted with "now"
- Modal auxiliaries: can and could
- Quotations: "geek, geek" and "This is a chair"
- Metalanguage about her own linguistic abilities
- Contrast shown by but
- Parallel stylistic structure
A Typical Neural Network
- Fixed set of features, concepts, nodes, and arcs.
- Learning is limited to adjusting weights (see the sketch below).
- Such a structure cannot learn a language.
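A minimal sketch of the point, with illustrative sizes and settings: a fixed 2-2-1 network learning XOR by backpropagation. The node-and-arc structure is frozen at construction, and training changes only the numerical weights, never the topology.

```python
# Fixed-topology network sketch: 2 inputs -> 2 hidden nodes -> 1 output.
# The structure never changes after construction; only the numbers do.
import math, random

random.seed(1)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Weight matrices fix the arcs once and for all.
w1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
b1 = [random.uniform(-1, 1) for _ in range(2)]
w2 = [random.uniform(-1, 1) for _ in range(2)]
b2 = random.uniform(-1, 1)

def forward(x):
    h = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
         for ws, b in zip(w1, b1)]
    return h, sigmoid(sum(w * hj for w, hj in zip(w2, h)) + b2)

data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
lr = 0.5
for _ in range(10000):
    for x, y in data:
        h, o = forward(x)
        d_o = (o - y) * o * (1 - o)  # output error signal
        for j in range(2):
            d_h = d_o * w2[j] * h[j] * (1 - h[j])  # uses the old w2[j]
            w2[j] -= lr * d_o * h[j]
            for i in range(2):
                w1[j][i] -= lr * d_h * x[i]
            b1[j] -= lr * d_h
        b2 -= lr * d_o

# Outputs should approach XOR's truth table (0, 1, 1, 0); convergence
# can depend on the random initialization.
print([round(forward(x)[1], 2) for x, _ in data])
```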
Questions for the Panelists
Why hasn't linguistics helped information retrieval?
Why aren't richer structures used in machine learning?
How could richer learning systems be designed?
How could other branches of cognitive science
(a) contribute to research in machine learning?
(b) benefit from research in machine learning?
What are the prospects for the future?
References
Slides presented on the opening day:
http://www.jfsowa.com/talks/uqam.htm
Paper on analogical reasoning by Sowa and Majumdar:
http://www.jfsowa.com/pubs/analog.htm
Peirce's tutorial on existential graphs, with commentary by Sowa:
http://www.jfsowa.com/peirce/ms514.htm
Selected papers by Peirce on semeiotic and related topics; see his 1903 lectures on pragmatism in vol. 2 for material related to this talk:
Peirce, Charles Sanders (EP) The Essential Peirce, ed. by N. Houser, C. Kloesel, and members of the Peirce Edition Project, 2 vols., Indiana University Press, Bloomington, 1991-1998.
Copyright ©2003, John F. Sowa